
 Eden Prairie


Scalable Unit Harmonization in Medical Informatics via Bayesian-Optimized Retrieval and Transformer-Based Re-ranking

de la Torre, Jordi

arXiv.org Artificial Intelligence

Objective: To develop and evaluate a scalable methodology for harmonizing inconsistent units in large-scale clinical datasets, addressing a key barrier to data interoperability. Materials and Methods: We designed a novel unit harmonization system combining BM25, sentence embeddings, Bayesian optimization, and a bidirectional transformer-based binary classifier for retrieving and matching laboratory test entries. The system was evaluated using the Optum Clinformatics Datamart dataset (7.5 billion entries). We implemented a multi-stage pipeline: filtering, identification, harmonization proposal generation, automated re-ranking, and manual validation. Performance was assessed using Mean Reciprocal Rank (MRR) and other standard information retrieval metrics. Results: Our hybrid retrieval approach combining BM25 and sentence embeddings (MRR: 0.8833) significantly outperformed both lexical-only (MRR: 0.7985) and embedding-only (MRR: 0.5277) approaches. The transformer-based reranker further improved performance (absolute MRR improvement: 0.10), bringing the final system MRR to 0.9833. The system achieved 83.39% precision at rank 1 and 94.66% recall at rank 5. Discussion: The hybrid architecture effectively leverages the complementary strengths of lexical and semantic approaches. The reranker addresses cases where initial retrieval components make errors due to complex semantic relationships in medical terminology. Conclusion: Our framework provides an efficient, scalable solution for unit harmonization in clinical datasets, reducing manual effort while improving accuracy. Once harmonized, data can be reused seamlessly in different analyses, ensuring consistency across healthcare systems and enabling more reliable multi-institutional studies and meta-analyses.
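To make the metrics and the hybrid idea in this abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation: the min-max normalization, the weighted-sum fusion, the weight `alpha`, and all function names are assumptions made for illustration. The abstract does not say how the lexical and embedding scores are combined or which parameters the Bayesian optimization tunes.

```python
def mean_reciprocal_rank(ranked_lists, gold):
    """Average of 1/rank of the first correct candidate (0 if it is absent)."""
    total = 0.0
    for candidates, target in zip(ranked_lists, gold):
        for rank, cand in enumerate(candidates, start=1):
            if cand == target:
                total += 1.0 / rank
                break
    return total / len(gold)


def recall_at_k(ranked_lists, gold, k):
    """Fraction of queries whose correct candidate appears in the top k."""
    hits = sum(1 for cands, target in zip(ranked_lists, gold) if target in cands[:k])
    return hits / len(gold)


def min_max(scores):
    """Rescale a {candidate: score} dict to [0, 1] so two scorers are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def hybrid_rank(lexical, semantic, alpha=0.5):
    """Hypothetical fusion: weighted sum of normalized lexical and semantic scores.

    `alpha` is an assumed hyperparameter; a tuner such as Bayesian
    optimization could search over it, but that is not confirmed by the source.
    """
    lex, sem = min_max(lexical), min_max(semantic)
    fused = {
        k: alpha * lex.get(k, 0.0) + (1.0 - alpha) * sem.get(k, 0.0)
        for k in set(lex) | set(sem)
    }
    # Sort by descending fused score; break ties alphabetically for determinism.
    return sorted(fused, key=lambda k: (-fused[k], k))


# Toy usage with made-up scores for three candidate units.
lexical_scores = {"mg/dL": 2.0, "g/L": 1.0, "mmol/L": 0.0}
semantic_scores = {"mg/dL": 0.2, "g/L": 0.9, "mmol/L": 0.1}
ranking = hybrid_rank(lexical_scores, semantic_scores)
```

With a single query, MRR is simply the reciprocal rank of the correct unit in `ranking`; averaging over many queries gives figures comparable in kind to those reported above.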


Starkey's All-New Genesis AI Hearing Aids Receive Second Prestigious Accolade

#artificialintelligence

Eden Prairie, Minnesota, April 05, 2023 (GLOBE NEWSWIRE) -- Starkey is proud to announce its all-new Genesis AI hearing aids have received a Red Dot Award: Product Design 2023, marking the second award won by the completely redesigned hearing technology, just weeks after its launch. This is the seventh year Starkey has won this award, which is one of the most renowned international product competitions in the world. The annual awards program recognizes the year's best products that are aesthetically appealing, functional, innovative, and most importantly, have outstanding design. "At Starkey, product development begins by pushing the edge of what's possible," said President and CEO, Brandon Sawalich. "Five years ago, we set out to make the impossible possible when we began to imagine our next-level product offering. Receiving this honor is a tribute to the amount of research and development we devoted to producing our all-new hearing technology, which is making a real impact on reducing the stigma around hearing aids."


BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning

Fayyaz, Mohsen, Aghazadeh, Ehsan, Modarressi, Ali, Pilehvar, Mohammad Taher, Yaghoobzadeh, Yadollah, Kahou, Samira Ebrahimi

arXiv.org Artificial Intelligence

Current pre-trained language models rely on large datasets for achieving state-of-the-art performance. However, past research has shown that not all examples in a dataset are equally important during training. In fact, it is sometimes possible to prune a considerable fraction of the training set while maintaining the test performance. Established on standard vision benchmarks, two gradient-based scoring metrics for finding important examples are GraNd and its estimated version, EL2N. In this work, we employ these two metrics for the first time in NLP. We demonstrate that these metrics need to be computed after at least one epoch of fine-tuning and they are not reliable in early steps. Furthermore, we show that by pruning a small portion of the examples with the highest GraNd/EL2N scores, we can not only preserve the test accuracy, but also surpass it. This paper details adjustments and implementation choices which enable GraNd and EL2N to be applied to NLP.
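As a rough illustration of the EL2N score mentioned in this abstract, the sketch below computes the L2 norm of the error vector (softmax probabilities minus the one-hot label) for one example, then drops the highest-scoring fraction of a dataset. It is a simplified sketch: it uses a single model's logits rather than an average over several runs, and the function names and the `fraction` parameter are assumptions, not the paper's code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]


def el2n(logits, label):
    """EL2N for one example: L2 norm of (predicted probabilities - one-hot label)."""
    probs = softmax(logits)
    return math.sqrt(
        sum((p - (1.0 if i == label else 0.0)) ** 2 for i, p in enumerate(probs))
    )


def prune_highest(scores, fraction):
    """Return indices kept after dropping the `fraction` of examples with the highest scores."""
    n_drop = int(len(scores) * fraction)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    dropped = set(order[:n_drop])
    return [i for i in range(len(scores)) if i not in dropped]


# Toy usage: four examples with made-up scores, prune the top 25%.
kept = prune_highest([0.1, 0.9, 0.5, 0.3], fraction=0.25)
```

Per the abstract, the logits fed to such a score should come from a model fine-tuned for at least one epoch, since scores from earlier steps are reported to be unreliable.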


Organ Shape Sensing using Pneumatically Attachable Flexible Rails in Robotic-Assisted Laparoscopic Surgery

McDonald-Bowyer, Aoife, Dietsch, Solène, Dimitrakakis, Emmanouil, Coote, Joanna M, Lindenroth, Lukas, Stoyanov, Danail, Stilli, Agostino

arXiv.org Artificial Intelligence

In robotic-assisted partial nephrectomy, surgeons remove a part of a kidney, often due to the presence of a mass. A drop-in ultrasound probe paired with a surgical robot is deployed to execute multiple swipes over the kidney surface to localise the mass and define the margins of resection. This sub-task is challenging and must be performed by a highly skilled surgeon. Automating this sub-task may reduce cognitive load for the surgeon and improve patient outcomes. The overall goal of this work is to autonomously move the ultrasound probe on the surface of the kidney, taking advantage of the Pneumatically Attachable Flexible (PAF) rail system, a soft robotic device used for organ scanning and repositioning. First, we integrate a shape-sensing optical fibre into the PAF rail system to evaluate the curvature of target organs in robotic-assisted laparoscopic surgery. Then, we investigate the impact of the stiffness of the material of the PAF rail on the curvature sensing accuracy, considering that soft targets are present in the surgical field. Finally, we use shape sensing to plan the trajectory of the da Vinci surgical robot paired with a drop-in ultrasound probe and autonomously generate an ultrasound scan of a kidney phantom.


Deep Mind's AlphaZero Game Playing AI -- Reduces Compute Time, Cuts Costs & Saves Energy

#artificialintelligence

CIOs & CTOs, the MIT Technology Review reported on October 5, 2022 that "DeepMind's game-playing AI AlphaZero has beaten a 50-year-old record in computer science. DeepMind has used its board-game playing AI to discover a faster way to solve a fundamental math problem in computer science, beating a record that has stood for more than 50 years. The problem, matrix multiplication, is a crucial type of calculation at the heart of many different applications, from displaying images on a screen to simulating complex physics. It is also fundamental to machine learning itself. Speeding up this calculation could have a big impact on thousands of everyday computer tasks, cutting costs and saving energy."



Top 3 Machine Learning Certification and Training Programs for Career Growth

#artificialintelligence

Glassdoor estimates the average salary for a Machine Learning Engineer at $131,001 USD. Indeed lists 2091 openings with an average nationwide Machine Learning Engineer salary of $131,276 USD. The San Francisco Bay Area is at the high end of the salary range at $193,485, with Eden Prairie, Minnesota at $106,780. ZipRecruiter calculates the average US Machine Learning Engineer salary at $130,530. Our first pick is the Machine Learning Engineer -- learn the data science and machine learning skills required to build and deploy machine learning models in production using Amazon SageMaker, Deep Learning Topics within Computer Vision and NLP, Developing Your First ML Workflow, Operationalizing Machine Learning Projects, and a Capstone Project -- Inventory Monitoring at Distribution Centers. Second, the Machine Learning with PyTorch Open Source Torch Library -- machine learning, and deep learning specifically, are presented with an eye toward their comparison to PyTorch, the scikit-learn library, the similarity between PyTorch tensors and the arrays in NumPy or other vectorized numeric libraries, clustering with PyTorch, and image classifiers. And third, AWS Certified Machine Learning -- the AWS Machine Learning-Specialty (ML-S) Certification exam; AWS Exploratory Data Analysis, which covers topics including data visualization, descriptive statistics, and dimension reduction and includes information on relevant AWS services; and Machine Learning Modeling.


A survey of statistical learning techniques as applied to inexpensive pediatric Obstructive Sleep Apnea data

Winn, Emily T., Vazquez, Marilyn, Loliencar, Prachi, Taipale, Kaisa, Wang, Xu, Heo, Giseon

arXiv.org Machine Learning

Obstructive sleep apnea (OSA), a form of sleep-disordered breathing characterized by recurrent episodes of partial or complete airway obstruction during sleep, is a serious health problem, affecting an estimated 1-5% of elementary school-aged children [9, 2]. Even mild forms of untreated pediatric OSA may cause high blood pressure, behavioral challenges, or impeded growth. Compared to adults, the symptoms of childhood-onset OSA are more varied and change continuously with development, making diagnosis difficult. The complexity of the data from surveys, biomedical measurements, 3D facial photos, and time-series data calls for state-of-the-art techniques from mathematics and data science. Clinical data, including that considered in confirming or ruling out a diagnosis of pediatric OSA, consist of high-dimensional multi-mode data with mixtures of variables of disparate types (e.g., nominal and categorical data of different scales, interval data, time-to-event and longitudinal outcomes), also called mixed or noncommensurate data.


C. H. Robinson Uses Heuristics to Solve Rich Vehicle Routing Problems

Khodabandeh, Ehsan, Snyder, Lawrence V., Dennis, John, Hammond, Joshua, Wanless, Cody

arXiv.org Artificial Intelligence

We consider a wide family of vehicle routing problem variants with many complex and practical constraints, known as rich vehicle routing problems, which are faced on a daily basis by C.H. Robinson (CHR). Since CHR has many customers, each with distinct requirements, various routing problems with different objectives and constraints must be solved. We propose a set partitioning framework with a number of route generation algorithms, which have been shown to be effective in solving a variety of different problems. The proposed algorithms have outperformed the existing technologies at CHR on 10 benchmark instances and have since been embedded into the company's transportation planning and execution technology platform.
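For readers unfamiliar with the set partitioning formulation this abstract refers to, the toy sketch below shows the idea: given candidate routes (each covering some customers at a cost), pick a subset of routes so that every customer is covered exactly once at minimum total cost. The brute-force search, the data, and all names are assumptions for illustration only. The abstract does not describe CHR's route generation or solution method, and a real system would use an integer programming solver or heuristics rather than enumeration.

```python
from itertools import combinations


def solve_set_partitioning(customers, routes):
    """Exhaustively find the cheapest set of routes covering each customer exactly once.

    routes: list of (frozenset_of_customers, cost) pairs.
    Returns (tuple_of_route_indices, total_cost), or (None, inf) if infeasible.
    """
    target = frozenset(customers)
    best_combo, best_cost = None, float("inf")
    for size in range(1, len(routes) + 1):
        for combo in combinations(range(len(routes)), size):
            covered = [c for i in combo for c in routes[i][0]]
            # Equal length plus equal set means no customer is missed or repeated.
            if len(covered) == len(target) and set(covered) == target:
                cost = sum(routes[i][1] for i in combo)
                if cost < best_cost:
                    best_combo, best_cost = combo, cost
    return best_combo, best_cost


# Toy instance: three customers and five hypothetical candidate routes.
candidate_routes = [
    (frozenset({"A", "B"}), 10),
    (frozenset({"C"}), 4),
    (frozenset({"A"}), 5),
    (frozenset({"B", "C"}), 7),
    (frozenset({"A", "B", "C"}), 13),
]
chosen, total = solve_set_partitioning({"A", "B", "C"}, candidate_routes)
```

In this formulation, side constraints such as time windows or capacity are typically handled when generating the candidate routes, so the selection step only needs to enforce coverage.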


Bayesian Spectral Deconvolution Based on Poisson Distribution: Bayesian Measurement and Virtual Measurement Analytics (VMA)

Nagata, Kenji, Mototake, Yoh-ichi, Muraoka, Rei, Sasaki, Takehiko, Okada, Masato

arXiv.org Machine Learning

In this paper, we propose a new method of Bayesian measurement for spectral deconvolution, which regresses spectral data into the sum of unimodal basis functions such as Gaussian or Lorentzian functions. Bayesian measurement is a framework for considering not only the target physical model but also the measurement model as a probabilistic model, and enables us to estimate the parameters of a physical model with their confidence intervals through a Bayesian posterior distribution given a measurement data set. Measurement with Poisson noise is one of the most effective systems to which our proposed method can be applied. Since the measurement time is strongly related to the signal-to-noise ratio for the Poisson noise model, Bayesian measurement with a Poisson noise model enables us to clarify the relationship between the measurement time and the limit of estimation. In this study, we establish the probabilistic model with Poisson noise for spectral deconvolution. Bayesian measurement enables us to perform a virtual computer simulation of a given measurement through the established probabilistic model. This property is called "Virtual Measurement Analytics (VMA)" in this paper. We also show that the relationship between the measurement time and the limit of estimation can be extracted by using the proposed method in simulations on synthetic data and on real data from XPS measurement of MoS$_2$.
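The probabilistic model described in this abstract can be sketched as a Poisson log-likelihood over photon counts, with the expected count at each energy given by a sum of Gaussian peaks. The sketch below is an assumed, simplified form: the constant `background` term, the linear scaling of the expected count by measurement `time`, and all names are illustrative choices, not the paper's exact model, and no prior or posterior sampling is included.

```python
import math


def spectrum(x, peaks, background):
    """Expected count rate at x: constant background plus a sum of Gaussian peaks.

    peaks: list of (amplitude, center, width) tuples.
    """
    return background + sum(
        a * math.exp(-0.5 * ((x - mu) / sigma) ** 2) for a, mu, sigma in peaks
    )


def poisson_log_likelihood(xs, counts, peaks, background, time):
    """Sum over channels of log Poisson(count | time * rate).

    Assumes the expected count is positive in every channel.
    """
    total = 0.0
    for x, y in zip(xs, counts):
        lam = time * spectrum(x, peaks, background)
        # log Poisson pmf: y*log(lam) - lam - log(y!)
        total += y * math.log(lam) - lam - math.lgamma(y + 1)
    return total


# Toy usage: counts that roughly follow one peak centered at 0.
xs = [-1.0, 0.0, 1.0]
counts = [6, 10, 6]
ll_centered = poisson_log_likelihood(xs, counts, [(10.0, 0.0, 1.0)], 0.0, 1.0)
ll_shifted = poisson_log_likelihood(xs, counts, [(10.0, 1.0, 1.0)], 0.0, 1.0)
```

Under this assumed scaling, a longer measurement time raises the expected counts, so the relative Poisson fluctuation shrinks roughly as one over the square root of the count, which is the link between measurement time and signal-to-noise ratio that the abstract mentions.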